Conference Proceedings

Part-Aware Unified Representation of Language and Skeleton for Zero-Shot Action Recognition

A Zhu, Q Ke, M Gong, J Bailey

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Published: 2024

Abstract

While remarkable progress has been made on supervised skeleton-based action recognition, the challenge of zero-shot recognition remains relatively unexplored. In this paper, we argue that relying solely on aligning label-level semantics and global skeleton features is insufficient to effectively transfer locally consistent visual knowledge from seen to unseen classes. To address this limitation, we introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales. PURLS introduces a new prompting module and a novel partitioning module to generate aligned textual and visual representations across differe..

University of Melbourne Researchers